paradigms:
----------
- supervised learning: The input to learning is a LABELED set of data,
  i.e. examples showing how the computer should behave. E.g. the
  computer gets a set of images along with their text descriptions and
  should learn to write descriptions for new images.
- unsupervised learning: The input to learning is an UNLABELED set of
  data and the computer is to find patterns in this data. This includes
  clustering (automatically creating groups of similar items),
  autoencoders (automatically creating a compression algorithm), finding
  associations between data items, detecting anomalies etc.
- reinforcement learning: The machine receives feedback on how well it
  is doing from a function and tries to maximize its performance. E.g.
  the computer is trying to design an algorithm to solve problem X and
  gets feedback from a function that returns the speed of solving the
  problem, making the machine look for the fastest algorithm for X.
- overfitting: A bad thing that happens when you use too complex a
  learning model and/or train it too much. The model will then fit the
  training data exactly but won't generalize well to any other data.

Neural Networks (NN)
====================

Learning models inspired by biological neural networks.

- neuron: Model of a neuron. One type is the perceptron, which is a
  linear classifier. A perceptron has one output and N inputs and is
  parametrized by N weights w1 ... wN (one for each input) and a bias
  B, which define a hyperplane in N-dimensional space that divides it
  into two halves. E.g. given 2 inputs we can write the equation

    input1 * w1 + input2 * w2 + B > 0

  which divides 2D space by a line into a part where the neuron is
  active and a part where it is not. A neuron has an activation
  function after it to produce the final output.
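The perceptron's decision rule can be sketched in a few lines of Python
(a minimal illustration; the AND weights below are just one hand-picked
example of a dividing line, not anything canonical):

```python
def perceptron(inputs, weights, bias):
    # weighted sum of inputs plus bias; the neuron is active if it is > 0
    s = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 if s > 0 else 0

# example: weights and bias hand-picked so the line separates
# the point (1, 1) from the other corners, i.e. a logical AND
w = [1.0, 1.0]
b = -1.5
print(perceptron([0, 0], w, b))  # 0
print(perceptron([1, 0], w, b))  # 0
print(perceptron([1, 1], w, b))  # 1
```

Here the line x1 + x2 - 1.5 = 0 is the 2D special case of the
hyperplane defined by the weights and bias.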
- activation function: Function that takes the output of a neuron and
  maps it into the <0,1> interval, usually a logistic function
  f(x) = 1 / (1 + e^(-1 * sharpness * x)):

               ______ 1
           .-''
          /
    ____..-'           0

- neural network: A set of interconnected neurons, in practice
  organized into layers so that each neuron (except in the 1st layer)
  has inputs from all neurons of the previous layer and outputs to all
  neurons of the next layer (except in the last layer). Choosing the
  number of layers and neurons is largely a matter of experimentation.
- backpropagation: Algorithm for training a NN in supervised learning
  (generalizations exist), implementing gradient descent (looking for a
  local minimum in parameter space).

Convolutional Neural Networks (CNN)
===================================

For processing images, also "shift invariant", similar to the human
vision system. The input is an image of size W x H and depth D (e.g.
32 x 32 pixels with depth 3 for RGB). The network contains these types
of layers:

- convolutional layers (CL): The neurons that represent close pixels in
  the image represented by the input layer go to a single neuron in
  this layer, which effectively achieves convolution. The layer
  actually performs N convolutions (which are really cross
  correlations, without flipping the kernel) of the input image of
  depth D0 with N predefined convolutional kernels (there can and
  should be several), each of size Wn x Hn x D0 (e.g. 5 x 5 pixels x 3
  for RGB depth). I.e. the input image of size W0 x H0 x D0 will be
  convolved into an output of size (W0 - Wn + 1) x (H0 - Hn + 1) x N.
  Example: the input image is 32 x 32 x 3 and the convolutional layer
  has 4 kernels, each of size 5 x 5 x 3. The output of the layer will
  be an image 28 x 28 x 4. Note the change in depth (it becomes the
  number of kernels). The image obtained by filtering with each kernel
  is an activation map (since convolution finds features in the image
  similar to the kernel), i.e. a detection of specific features.
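The size arithmetic and the cross correlation itself can be sketched
roughly as follows (a naive single-channel illustration; the function
names and the edge-detector kernel are made up for this example):

```python
def conv_output_size(w0, h0, wn, hn, n_kernels):
    # input W0 x H0 (x D0) filtered by N kernels of size Wn x Hn (x D0)
    # gives an output of size (W0 - Wn + 1) x (H0 - Hn + 1) x N
    return (w0 - wn + 1, h0 - hn + 1, n_kernels)

def cross_correlate(image, kernel):
    # naive single-channel CNN-style "convolution": the kernel is slid
    # over the image without being flipped (i.e. cross correlation)
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            row.append(sum(image[y + j][x + i] * kernel[j][i]
                           for j in range(kh) for i in range(kw)))
        out.append(row)
    return out

print(conv_output_size(32, 32, 5, 5, 4))  # (28, 28, 4), as in the example

edges = cross_correlate(
    [[0, 0, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 1, 1]],
    [[-1, 1]])  # a toy vertical-edge-detecting kernel
print(edges)  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The activation map produced by the edge kernel is nonzero exactly where
the image brightness jumps, which is the "feature detection" the text
describes.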
The first layer in the network is always convolutional (others don't
make sense there). This one detects low-level features (lines, corners
etc.). From these features each consecutive convolutional layer detects
progressively higher level features (faces, cars etc.).

- pooling layers (PL): Scale down the image by some factor, e.g. 2x.
  There are different methods to do this, the most common being max
  (take the maximum pixel of each area), but average or min can also be
  used.
- fully connected layers (FC): Connect every neuron of the previous
  layer to every neuron in this layer. These come at the end of the
  network, after all the convolutional and pooling layers, and work
  like a normal general neural network to create the final output.

The output of the network is a C dimensional array of numbers, C being
the number of classes into which we want to classify. Each element of
the array gives the probability of the corresponding class. A typical
CNN looks like this:

  input image -> CL -> PL -> CL -> PL -> ... -> FC -> FC -> ... -> output

Generative Adversarial Networks (GAN)
=====================================

GANs are based on the competition of two networks:

1. generative: Tries to generate new data that look like the training
   data. The goal of this network is to maximize the error of the
   discriminative network (i.e. trick it into believing the synthesized
   results are real).
2. discriminative: Tries to learn to distinguish between the real and
   synthesized data.

As learning goes on, both networks get better: 1. at generating
faithful data, 2. at spotting fakes, in turn forcing 1. to get better
etc.
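The alternating training can be illustrated with a deliberately tiny 1D
toy (everything here is a made-up illustration, not a practical GAN):
the "real" data is the constant value 5, the generator is a single
parameter mu, and the discriminator is a logistic classifier
D(x) = sigmoid(w * x + b). Each iteration takes one gradient step for
the discriminator, then one for the generator:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

real = 5.0                  # the "dataset": a single constant value
mu = 0.0                    # generator parameter (its output IS mu)
w, b = 0.0, 0.0             # discriminator parameters
lr_d, lr_g = 0.1, 0.05      # learning rates, picked by hand

for _ in range(5000):
    fake = mu
    # discriminator step: minimize -log D(real) - log(1 - D(fake)),
    # i.e. get better at telling real from synthesized
    d_real = sigmoid(w * real + b)
    d_fake = sigmoid(w * fake + b)
    w -= lr_d * (-(1 - d_real) * real + d_fake * fake)
    b -= lr_d * (-(1 - d_real) + d_fake)
    # generator step: minimize -log D(fake), i.e. maximize the
    # discriminator's error on the synthesized value
    d_fake = sigmoid(w * fake + b)
    mu -= lr_g * (-(1 - d_fake) * w)

print(mu)  # drifts toward the real value 5 as the two networks compete
```

Real GANs replace mu and (w, b) with full neural networks and feed the
generator random noise, but the alternating minimize/maximize structure
is the same.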